auto encoder
AnoRand: A Semi Supervised Deep Learning Anomaly Detection Method by Random Labeling
Mayaki, Mansour Zoubeirou A, Riveill, Michel
Anomaly detection, or more generally outlier detection, is one of the most popular and challenging subjects in theoretical and applied machine learning. The main challenge is that in general we have access to very few labeled data or no labels at all. In this paper, we present a new semi-supervised anomaly detection method called AnoRand, which combines a deep learning architecture with random synthetic label generation. The proposed architecture has two building blocks: (1) a noise detection (ND) block composed of a feed-forward perceptron and (2) an autoencoder (AE) block. The main idea of this new architecture is to learn one class (e.g. the majority class in the case of anomaly detection) as well as possible by taking advantage of the ability of autoencoders to represent data in a latent space and the ability of a Feed Forward Perceptron (FFP) to learn one class when the data is highly imbalanced. First, we create synthetic anomalies by randomly disturbing (adding noise to) a few samples (e.g. 2%) from the training set. Second, we use the normal and the synthetic samples as input to our model. We compared the performance of the proposed method to 17 state-of-the-art unsupervised anomaly detection methods on synthetic datasets and 57 real-world datasets. Our results show that this new method generally outperforms most of the state-of-the-art methods and has the best performance (AUC ROC and AUC PR) on the vast majority of reference datasets. We also tested our method in a supervised way by using the actual labels to train the model. The results show that it performs very well compared to most state-of-the-art supervised algorithms.
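The random-labeling step described in this abstract can be sketched roughly as follows. This is an illustration, not the authors' code: the function name `make_synthetic_anomalies`, the Gaussian noise model, and the `noise_scale` parameter are assumptions. The abstract only says that a few samples (e.g. 2%) are randomly disturbed.

```python
import numpy as np

def make_synthetic_anomalies(X, fraction=0.02, noise_scale=3.0, seed=0):
    """Perturb a random `fraction` of rows and label them 1 (synthetic anomaly).

    Hypothetical helper: Gaussian noise scaled by each feature's standard
    deviation is an assumed noise model, not the one from the paper.
    """
    rng = np.random.default_rng(seed)
    X_out = np.array(X, dtype=float, copy=True)
    n_samples, n_features = X_out.shape
    n_anomalies = max(1, int(round(fraction * n_samples)))

    # Pick distinct rows to disturb.
    idx = rng.choice(n_samples, size=n_anomalies, replace=False)

    # Add noise that is large relative to each feature's natural spread.
    feature_std = X_out.std(axis=0)
    noise = rng.normal(0.0, noise_scale, size=(n_anomalies, n_features))
    X_out[idx] += noise * feature_std

    # Label 0 = untouched sample, 1 = synthetic anomaly.
    y = np.zeros(n_samples, dtype=int)
    y[idx] = 1
    return X_out, y

# Toy training set: 500 samples with 4 features.
X = np.random.default_rng(1).normal(size=(500, 4))
X_noisy, y = make_synthetic_anomalies(X)
```

The pair `(X_noisy, y)` would then be fed to a classifier such as the ND block, with the untouched rows acting as the "normal" class.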
An FNet based Auto Encoder for Long Sequence News Story Generation
Mandal, Paul K., Mahto, Rakeshkumar
In this paper, we design an autoencoder based on Google's FNet architecture in order to generate text from a subset of news stories contained in Google's C4 dataset. We discuss previous attempts and methods to generate text from autoencoders and non-LLM models. FNet offers efficiency advantages over BERT-based encoders: it trains 80% faster on GPUs and 70% faster on TPUs. We then compare the outputs of this autoencoder at different epochs. Finally, we analyze what outputs the encoder produces with different seed text.
Variational Autoencoders
I'm making written guides on generative deep learning. In my last guide we went through a simple autoencoder: for each image we generated 16 codes and regenerated the image using only those 16 numbers. Now let's talk about variational autoencoders (VAEs), which have a similar architecture but belong to the family of probabilistic graphical models. In our VAE model the encoder generates two parameters, a mean and a variance; we draw a sample from that distribution and pass it to the decoder to rebuild the input. We cannot write VAEs like simple autoencoders because they have a more complex loss function, so how can we make a VAE model?
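To make the sampling step and the "more complex loss function" concrete, here is a minimal NumPy sketch. It is not the guide's own code. It assumes the encoder outputs a log-variance rather than a variance (a common convention), a standard normal prior, and a squared-error reconstruction term. The function names and the 64-value input size are hypothetical.

```python
import numpy as np

def sample_latent(mean, log_var, rng):
    """Reparameterization trick: z = mean + std * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mean.shape)
    return mean + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, exp(log_var)) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mean**2 - np.exp(log_var), axis=-1)

def vae_loss(x, x_recon, mean, log_var):
    """Reconstruction error plus KL term, averaged over the batch."""
    recon = np.sum((x - x_recon) ** 2, axis=-1)  # assumed squared-error term
    return np.mean(recon + kl_to_standard_normal(mean, log_var))

rng = np.random.default_rng(0)
mean = np.zeros((2, 16))      # 16 latent codes per image, as in the guide
log_var = np.zeros((2, 16))   # log-variance 0 means variance 1
z = sample_latent(mean, log_var, rng)

x = np.ones((2, 64))          # stand-in for two flattened images
loss = vae_loss(x, x, mean, log_var)
```

The KL term is what a plain autoencoder lacks: it pulls the encoder's distribution toward the prior, so that sampling codes from the prior later yields plausible decoder outputs.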
Semi-supervised Anomaly Detection using Auto Encoders
In this article, I'll be discussing a paper [1] that proposes an AutoEncoder based approach for the task of semi-supervised anomaly detection. If you want to look at the GitHub repository link, results and conclusion directly, please scroll to the bottom of the article. Anomaly detection refers to the task of finding unusual instances that stand out from the normal data [1]. The non-conforming patterns can be referred to using different names depending on the application area/domain, such as anomalies, outliers, exceptions, defects, contaminants, etc. [2] In several applications, these outliers or anomalous samples are of greater interest compared to the normal ones. Specifically in the case of industrial surface inspection and infrastructure asset management, finding defects (anomalous regions) is of extreme importance.
Anomaly Detection from Head and Abdominal Fetal ECG -- A Case study of IOT anomaly detection using Generative Adversarial Networks
This DoNut network uses the variational auto-encoder ("Auto-Encoding Variational Bayes", Kingma, D.P. and Welling), which is a deep Bayesian network with observed variable x and latent variable z. The VAE is built using TFSnippet (a library for writing and testing TensorFlow models). The generative process of the auto-encoder starts with latent variable z drawn from the prior distribution p(z), passes it through a hidden network h(z), and then models the observed variable x with distribution p(x | h(z)). For the posterior inference of p(z | x), variational inference techniques are adopted to train a separate distribution q(z | h(x)). Here each Sequential function creates a multi-layer perceptron with 2 hidden layers of 50 units and ReLU activation.
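As a rough illustration of the hidden network h(x) and the distribution q(z | h(x)) described here, the sketch below builds a multi-layer perceptron with 2 hidden layers of 50 units and ReLU activation in plain NumPy. It does not use the TFSnippet or TensorFlow APIs. The helper names, the He-style weight initialisation, the softplus used to keep the standard deviation positive, and the window length (120) and latent size (5) are all assumptions made for the example.

```python
import numpy as np

def init_mlp(in_dim, hidden=50, n_hidden=2, seed=0):
    """Create (weight, bias) pairs for an MLP with `n_hidden` hidden layers."""
    rng = np.random.default_rng(seed)
    sizes = [in_dim] + [hidden] * n_hidden
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def mlp_forward(params, x):
    """Apply each dense layer followed by ReLU; returns the hidden features h(x)."""
    h = x
    for W, b in params:
        h = np.maximum(h @ W + b, 0.0)
    return h

def gaussian_head(h, W_mean, b_mean, W_std, b_std):
    """Map hidden features to the mean and std of a diagonal Gaussian q(z | h(x))."""
    mean = h @ W_mean + b_mean
    # softplus(a) = log(1 + exp(a)) keeps the standard deviation positive.
    std = np.logaddexp(0.0, h @ W_std + b_std) + 1e-4
    return mean, std

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 120))          # 8 windows of 120 points (assumed sizes)
params = init_mlp(in_dim=120)
h = mlp_forward(params, x)

z_dim = 5                              # assumed latent size
W_mean = rng.normal(0.0, 0.1, size=(50, z_dim))
W_std = rng.normal(0.0, 0.1, size=(50, z_dim))
mean, std = gaussian_head(h, W_mean, np.zeros(z_dim), W_std, np.zeros(z_dim))
```

The generative side p(x | h(z)) would mirror this structure, with a second MLP taking z as input and a Gaussian head over x.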
Self Training Autonomous Driving Agent
Kotyan, Shashank, Vargas, Danilo Vasconcellos, U, Venkanna
Today, as the world enters the era of automation, one key industrial product that is yet to be completely automated is the intelligent vehicle. According to Pcmag, "An autonomous vehicle is a computer-controlled car that drives itself" [1]. Many companies are leading the research in autonomous vehicles; some of the most prominent are Google and Tesla. Research nowadays is geared towards becoming the leader in the new age of the driverless car. However, the current driverless car is still far from being an intelligent autonomous car. Current autonomous vehicles use LiDAR (Light Detection And Ranging) technology, which measures distance to a target by illuminating the target with pulsed laser light and measuring the reflected pulses with a sensor. This measurement and analysis helps autonomous vehicles stay on track. However, LiDAR has various limitations: a) High dependency on prevention of object collision rather than driving.
Deep Learning Demystified
Guest blog post by Christopher Dole and other contributors, originally posted here. Deep Learning is one of the most revolutionary and disruptive technologies ever developed in Data Science. Essentially, this is a class of algorithms inspired by how the human brain works, and it has the ability to automate and replace most of the world's jobs. This is what enables self-driving cars to function and what allows Spotify to create very customized playlists and recommendations. This is how YouTube is able to identify faces and animals in videos and how Siri can understand and process free speech in milliseconds.
Dimensionality reduction methods for molecular simulations
Doerr, Stefan, Ariz-Extreme, Igor, Harvey, Matthew J., De Fabritiis, Gianni
Molecular simulations produce very high-dimensional data-sets with millions of data points. As analysis methods are often unable to cope with so many dimensions, it is common to use dimensionality reduction and clustering methods to reach a reduced representation of the data. Yet these methods often fail to capture the most important features necessary for the construction of a Markov model. Here we demonstrate the results of various dimensionality reduction methods on two simulation data-sets, one of protein folding and another of protein-ligand binding. The methods tested include a k-means clustering variant, a non-linear auto encoder, principal component analysis and tICA. The dimension-reduced data is then used to estimate the implied timescales of the slowest process by a Markov state model analysis to assess the quality of the projection. The projected dimensions learned from the data are visualized to demonstrate which conformations the various methods choose to represent the molecular process.
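Of the methods listed in this abstract, principal component analysis is the simplest to sketch. The example below is a generic SVD-based PCA in NumPy, not the authors' pipeline. The function name `pca_project` and the toy feature matrix are assumptions, and tICA, the autoencoder and the Markov state model analysis are not covered.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_var = S[:n_components] ** 2 / (X.shape[0] - 1)
    return X_centered @ components.T, components, explained_var

# Toy stand-in for simulation features: 1000 frames, 10 features,
# with the first feature given a much larger spread than the others.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
X[:, 0] *= 10.0

Y, components, explained_var = pca_project(X, n_components=2)
```

`Y` holds the dimension-reduced data (one row per frame). In the workflow the abstract describes, a projection like this would then be clustered and passed to the Markov state model analysis.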